Abstract: We study the use of polar codes for both discrete- 
and continuous-variable Quantum Key Distribution (QKD). 
Although very large blocks must be used to obtain the efficiency 
required by QKD, and especially by continuous-variable QKD, 
the implementation of polar codes on generic x86 CPUs is 
practical. Thanks to recursive decoding, they exhibit excellent 
decoding speed, much higher than that of large, irregular Low 
Density Parity Check (LDPC) codes implemented on similar 
hardware, and competitive with implementations of the same 
LDPC codes on high-end Graphics Processing Units (GPUs). 

Index Terms — QKD, polar codes, LDPC codes, GPU decoding. 



I. Introduction 

Along with Quantum Random Number Generators (QRNGs), 
QKD [1] is among the first industrial applications 
of quantum information technology. The two parties of a 
QKD protocol, Alice and Bob, exchange quantum signals 
through a physical (also called quantum) channel (such as light 
propagation through optical fibers or free-space propagation) 
and can extract a secret key, secure in the information-theoretic 
sense, even in the presence of an eavesdropper with unlimited 
computational power. 

Two families of QKD technologies exist: Discrete Variables 
(DV) QKD and Continuous Variables (CV) QKD. In both 
cases, the transmission of a binary message, the raw key, 
on a quantum noisy channel is at the heart of the protocol. 
Errors resulting from this transmission have to be corrected 
for Alice and Bob to be able to compute the same key. The 
quantum channels of DVQKD and CVQKD have different 
error distributions: in the DVQKD case, the channel is a 
Binary Symmetric Channel (BSC) whose probability of error 
is the Quantum Bit Error Rate (QBER). For CVQKD, it is a 
Gaussian channel with both a transmission T, and a Gaussian 
noise, composed of a quantum noise, the shot noise, and other 
classical noises which form the excess noise. 

When linear, non-interactive, error-correcting codes are 
used, the error correction algorithm uses the fact that the string 
sent satisfies some predefined set of linear equations where 
some linear combinations of message bits, or parity bits, are 
equal to zero. Transmission is therefore preceded by an encod- 
ing step where the message to be transmitted is transformed 
into a string that satisfies these equations. However, in the 
QKD setting, contrary to the usual setup of error-correcting 
codes, a noiseless classical channel is available alongside the 
quantum noisy channel. Using this channel, the encoding step 
can be avoided: the message and the string sent are equal, 
and the values of the parity bits are revealed on the classical 
channel. Therefore the performance of the encoding step is 
not considered in our case. 
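
The mechanism described above (the raw key is sent unencoded and 
only parity bits are revealed) can be sketched on a toy example. 
The (7,4) Hamming code, the brute-force search and the names 
`syndrome` and `reconcile` are assumptions chosen for 
illustration; they stand in for the very large codes and the 
efficient decoders discussed in the rest of the paper. 

```python
from itertools import product

# Parity-check matrix of the (7,4) Hamming code: a toy stand-in for the
# very large codes discussed in the text.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(bits):
    """Return the parity bits H * bits (mod 2)."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]

def reconcile(y, alice_syndrome):
    """Bob's side: brute-force search for the lowest-weight error pattern
    consistent with Alice's parity bits, then correct y with it."""
    target = [a ^ b for a, b in zip(alice_syndrome, syndrome(y))]
    for e in sorted(product([0, 1], repeat=len(y)), key=sum):
        if syndrome(e) == target:
            return [yi ^ ei for yi, ei in zip(y, e)]

x = [1, 0, 1, 1, 0, 0, 1]      # Alice's raw key block, sent without encoding
y = list(x)
y[4] ^= 1                      # Bob receives it with one bit flipped
parity = syndrome(x)           # revealed on the noiseless classical channel
corrected = reconcile(y, parity)
print(corrected == x)
```

Only the three parity bits travel on the classical channel; these 
are the bits that must later be subtracted from the key. 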

The limitations the error correction step introduces in the 
implementation of a QKD system are two-fold. First, the 
number of raw key bits or linear combination of raw key bits 
revealed during the error correction step must be subtracted 
from the final key size during the privacy amplification step 
0. Therefore efficient codes, i.e. codes with thresholds close 
to the Shannon Bound, are needed. Secondly, the throughput of 
the error-correction, which is usually not high because of the 
aforementioned efficiency constraints, may limit the final key 
rate below what is allowed by the optics. On the other hand, 
cost, power consumption, and latency constraints are much 
less of an issue than in typical error-correction applications. 

We propose to examine the relevance for QKD of a new 
family of codes, polar codes, introduced by Arıkan [3]. Based 
on our previous discussion, we will look at their distance to 
Shannon bounds and at their decoding speed. For a given block 
size N and a fixed channel, the polar decoding algorithm is 
deterministic. Its execution time provably scales as O(N log N); 
it also has a simple recursive structure which gives good 
practical performance. However, we will see that very large 
blocks are required to achieve the high efficiencies needed for 
QKD on the BSC or the Binary Input Additive White Gaussian 
Noise Channel (BIAWGNC). 
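
The N log N scaling comes from a butterfly structure with log2(N) 
stages of N/2 operations each. The sketch below shows this 
structure on the polar transform itself (multiplication by the 
n-fold Kronecker power of the 2x2 kernel over GF(2), in natural 
bit order) rather than on the decoder; the function name and the 
operation counter are illustrative assumptions. 

```python
def polar_transform_inplace(bits):
    """Apply the polar transform over GF(2) with log2(N) butterfly stages.

    Returns the transformed block and the number of XORs performed.
    The block length must be a power of two.
    """
    x = list(bits)
    n = len(x)
    xors = 0
    step = 1
    while step < n:
        # One stage: combine the two halves of every block of size 2*step.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
                xors += 1
        step *= 2
    return x, xors

u = [1, 0, 1, 1, 0, 0, 1, 0]
x, count = polar_transform_inplace(u)
back, _ = polar_transform_inplace(x)   # the transform is its own inverse
print(count, back == u)
```

For N = 8 the counter reaches (N/2) log2(N) = 12, and applying the 
transform twice returns the original block. 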

The paper is organised as follows: in Section II, the impact 
of the imperfection of the error-correction procedure in both 
DVQKD and CVQKD is detailed and previous work is 
reviewed. In Section III, the usage of polar codes to correct 
errors in a QKD setup is laid out. Finally, the performances of 
polar codes and LDPC codes are compared in Section IV. 



II. Effect of an imperfect error correction step in QKD 

A. Secret key rate and error correction 

1) Key rate and distance of error correction to Shannon 
bounds: In a classical DVQKD setup, Alice encodes a classical 
bit onto the phase or the polarization of a photon and sends 
this photon to Bob, who measures it with a Single Photon 
Detector (SPD) and gets a bit value. As regards CVQKD, 
Alice encodes continuous information onto the quadratures of 
the electromagnetic field and sends weak light pulses to Bob, 
who performs either a homodyne measurement on one single 
quadrature or a heterodyne measurement on both quadratures. 
In both cases, Bob ends up with a bit string, as in a DVQKD 
setup, because of the finite precision of his measurement 
apparatus. Since this step is repeated many times, Alice and 
Bob are given two bit strings x and y after the quantum 
exchange. 

The eavesdropper, Eve, has a quantum state E, generally 
correlated to x and y. If we assume Alice is chosen as the 
reference for the establishment of a secret key, the maximal 
secret information shared by Alice and Bob is given by 
$S(x|E)$, the von Neumann entropy of the variable x 
conditioned on Eve's knowledge (which is in general 
quantum). In order to compute an information-theoretic secret 
key rate, all the information corresponding to the errors 
between x and y, $H(x|y)$, that is, the conditional Shannon 
entropy of x given y, is assumed to be known by Eve and 
is subtracted from the final key. Thus the theoretical secret 
key rate reads: 

$$K_{\mathrm{th}} = S(x|E) - H(x|y) \tag{1}$$

This expression can be rewritten in terms of mutual 
information as: 

$$K_{\mathrm{th}} = I(x:y) - S(x:E) \tag{2}$$
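
The step from (1) to (2) can be made explicit. The short 
derivation below is a sketch that assumes x is a classical 
variable, so that the usual identities relating conditional 
entropy and mutual information apply; H(x), the Shannon entropy 
of x, is a symbol introduced here only for this purpose. 

```latex
\begin{align*}
S(x|E) &= H(x) - S(x:E), \\
H(x|y) &= H(x) - I(x:y), \\
K_{\mathrm{th}} &= S(x|E) - H(x|y) = I(x:y) - S(x:E).
\end{align*}
```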

According to information theory, one can never extract 
the full mutual information $I(x:y)$ between 
Alice and Bob with a finite error-correcting code. That is why 
one introduces a factor $\beta$ which represents the 
reconciliation efficiency and ranges from 0 when no information 
is extracted to 1 in the theoretical perfect reconciliation 
scheme: 

$$K_{\mathrm{real}} = \beta I(x:y) - S(x:E) \tag{3}$$

Thus an imperfect reconciliation scheme results in a reduction 
of the secret key rate and a limitation of the range of 
the protocol. With all known protocols, $I(x:y) - S(x:E)$ 
decreases faster with the distance than $I(x:y)$ and $S(x:E)$ 
individually, so that the effect of $\beta < 1$ is most severe at 
large distances. This last effect limited the range of CVQKD 
protocols for a long time before specific error-correcting codes 
were proposed [4], [5]. 
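
To make the reconciliation efficiency concrete on the BSC, the 
sketch below computes the capacity 1 - h(p) and the efficiency of 
a code that reveals a given number of parity bits per block. It 
assumes, as is common for one-way reconciliation, that the 
efficiency is the ratio of the achieved rate to the capacity; the 
function names and the example numbers are illustrative and do 
not come from the paper. 

```python
import math

def h2(p):
    """Binary entropy function, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(qber):
    """Capacity of a BSC with crossover probability qber, in bits per use."""
    return 1.0 - h2(qber)

def efficiency(n_block, n_revealed, qber):
    """Assumed definition: (1 - revealed fraction) divided by the capacity."""
    rate = 1.0 - n_revealed / n_block
    return rate / bsc_capacity(qber)

# Hypothetical example: a block of 2**20 bits at a QBER of 2%,
# with 200 000 parity bits revealed on the classical channel.
print(bsc_capacity(0.02))
print(efficiency(2**20, 200_000, 0.02))
```

An efficiency of 1 would require revealing exactly N h(p) bits, 
which no finite code achieves. 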

2) Key rate and error correction computation time: Long-range 
QKD therefore needs error-correcting codes and decoding 
schemes enabling operation as close to the Shannon 
limit $\beta = 1$ as possible. However, decoding close to the 
Shannon limit can be a computationally demanding task; the 
computation time may then limit the throughput of a QKD 
experiment. In [17], the raw optical repetition rate is 500 
kHz and the raw data rate reduces to 350 kHz because some 
pulses are used for synchronisation and parameter 
estimation. Since the best reconciliation algorithm available in 
[17] is limited to about 63 000 symbols per second, only 18% 
of the available symbols can be used to extract secret keys. 
More generally, the key rate of a practical system is affected 
by a factor $\alpha = D_{\mathrm{ECCout}} / D_{\mathrm{ECCin}}$, 
where $D_{\mathrm{ECCout}}$ stands for the error-correction 
output rate (63 kb/s in our example) and $D_{\mathrm{ECCin}}$ 
stands for the data output rate of the system used as an input 
for the error correction (350 kb/s in our example). 

$$K_{\mathrm{sys}} = \alpha \left( \beta I(x:y) - S(x:E) \right) \tag{4}$$

3) Key rate and error correction frame error rate: The 
frame error rate (FER), or the probability for a message to be 
incorrectly decoded, is usually one of the most regarded char- 
acteristics of an error-correcting code, since failure to decode 
a message is usually associated with data loss in conventional 
data transmission scenarios, at best causing retransmission 
delays. However, in the quantum key distribution setting, raw 
key blocks incorrectly decoded are simply discarded by both 
the sender and the receiver. As a result, the raw key rate and 
final key rate are affected by a factor $(1 - \mathrm{FER})$. Frame error 
rates that are unacceptable in conventional error correction 
applications are therefore sufficient in the QKD case. Besides, 
accepting a high FER enables faster error correction. Our 
target figure in the rest of this article is a FER of 0.1. 

Taking into account all the previously discussed imperfections 
of error correction in the QKD case, the final key rate is 

$$K = \alpha (1 - \mathrm{FER}) \left( \beta I(x:y) - S(x:E) \right) \tag{5}$$
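
A small numerical sketch of (5) shows how the three imperfections 
combine. The throughput ratio 63/350 and the FER of 0.1 are taken 
from the text; the values of I(x:y) and S(x:E) are hypothetical 
and only serve to show the sensitivity to the efficiency. 

```python
def final_key_rate(alpha, fer, beta, i_xy, s_xe):
    """Equation (5): K = alpha * (1 - FER) * (beta * I(x:y) - S(x:E))."""
    return alpha * (1.0 - fer) * (beta * i_xy - s_xe)

alpha = 63e3 / 350e3   # throughput ratio quoted in the text, i.e. 0.18
fer = 0.1              # target frame error rate used in this article

# Hypothetical channel: I(x:y) = 0.10 and S(x:E) = 0.08 bit per symbol.
for beta in (0.90, 0.95, 1.00):
    k = final_key_rate(alpha, fer, beta, i_xy=0.10, s_xe=0.08)
    print(f"beta = {beta:.2f}: K = {k:.5f} bit per symbol")
```

With these made-up numbers, going from an efficiency of 0.90 to 
0.95 raises the key rate by half, because the two terms of the 
difference are close to each other. 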

B. Previous work 

Most of the error-correction algorithms designed especially 
for DVQKD, such as Cascade 0, 0, 0, Winnow or 
Liu's algorithm [10] suffer from latency problems because they 
are highly interactive. Although the latest ones exhibit less 
interactivity than Cascade, Cascade remains the algorithm most 
used in DVQKD experiments because it exhibits an efficiency 
higher than 96% [11] over the range [0; 0.11] for the error 
probability of a standard Binary Symmetric Channel (BSC), 
which is the admissible range for the QBER to distribute a 
secret in DVQKD. The maximum reported Cascade speed is 
about 5.5 Mb/s with 4 threads on a quad-core processor [12]. 

Low Density Parity Check (LDPC) codes have also been 
developed for DVQKD experiments and have efficiencies 
similar to Cascade over the range [0; 0.02], while they 
present a significant improvement for bit error rates above 
0.02 [11]. As regards interactivity, LDPC codes require only 
one exchange, contrary to Cascade, which is highly interactive. 
Since LDPC codes are optimized for a given error probability, 
puncturing and shortening techniques [13] can be used to 
extend their efficiency to a wider range, and protocols that 
reconcile information while maintaining a low interactivity 
have been proposed [14], [15], [16]. However, the speed of 
high-efficiency LDPC error correction has received little 
attention, except for CVQKD, where the authors of [17] report 
a 40 kb/s speed on CPU and a 60 kb/s speed on GPU. 

Modern coding techniques have mainly been used for continuous 
variables, with Turbo codes or LDPC codes. The main 
difficulty as regards continuous variables is that the best 
known protocols require a Gaussian modulation while the noise 
added by the channel is Gaussian too. Thus, one has to deal 
with an Additive White Gaussian Noise Channel (AWGNC), 
and high-efficiency error correction is particularly hard at low 
Signal to Noise Ratios (SNRs), which correspond to a long 
operating distance for CVQKD. However, in [4], the authors 
proposed a technique that encodes the information in 
binary variables, which allows us to deal with a Binary Input 
(BI) AWGNC instead of the usual AWGNC. Low-rate 
high-efficiency multi-edge LDPC codes can be designed for this 
channel [5], [18], which results in a considerably extended 
achievable distance for CVQKD with a Gaussian modulation. 

III. Polar codes for QKD: efficiency vs. block sizes 

The use of polar codes has been previously considered 
for other scenarios. In [21], the authors show that the secrecy 
capacity of classical wiretap channels can be achieved using 
polar codes. This work was extended to quantum wiretap 
channels with a classical eavesdropper in [22]. In [23], polar 
codes are used to transmit quantum information and an 
efficient decoder is provided for both Pauli channels and erasure 
channels. In [24], it is shown that the Holevo capacity of 
lossy optical channels can be achieved with polar codes, but an 
implementation of a quantum successive cancellation decoder 
is far beyond what can be experimentally realized today with 
quantum states. 

The QKD and wiretap channel scenarios are nevertheless 
different: in QKD, Alice and Bob's correlations are directly 
used to compute an upper bound on Eve's information without 
making any assumption on the channel between Alice and Eve, 
whereas in the wiretap channel scenario, the channel between 
Alice and Eve is assumed to be characterized. 

Polar codes exhibit some specificities that make them suit- 
able for QKD error correction. First, they are easily employed 
in a rateless setup where the noise of the channel can change 
over time. Secondly, they enable non-interactive error cor- 
rection, similarly to LDPC codes, and contrary to two-way 
protocols like Cascade. In this section, we evaluate the block 
sizes needed to obtain the efficiencies required for QKD. This 
impacts the decoding throughputs that can be obtained in 
practical implementations. 

In polar codes, individual copies of a symmetric Binary 
Discrete Memoryless Channel (BDMC) are combined recursively 
in order to form a new set of channels that become more 
and more differentiated, such that in the asymptotic 
limit channels are either error-free or completely noisy, with 
a fraction of error-free channels equal to the capacity of the 
BDMC. This phenomenon is called channel polarization: each 
channel becomes either noiseless or noisy as the block length 
goes to infinity. In the asymptotic limit, the capacity of the 
BDMC can be achieved by sending the information bits through 
the noiseless channels, while in practice, for finite block 
lengths, only a fraction of this capacity is achieved, using the 
bits with almost zero error probability. The speed at which 
channels converge to noiseless or noisy channels is called the 
polarization speed. 
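
Channel polarization can be observed numerically in the one case 
where the recursion is exact, the Binary Erasure Channel (BEC): 
combining two copies of a BEC with erasure probability e yields 
one BEC with probability 2e - e^2 and one with probability e^2. 
The sketch below is an illustration only; the channels studied in 
this paper are the BSC and the BIAWGNC, for which no such 
closed-form rule is used, and the threshold of 1e-3 is an 
arbitrary assumption. 

```python
def polarize_bec(eps, levels):
    """Erasure probabilities of the 2**levels channels built from BEC(eps)."""
    z = [eps]
    for _ in range(levels):
        # Each channel splits into a worse one (2e - e^2) and a better one (e^2).
        z = [w for e in z for w in (2 * e - e * e, e * e)]
    return z

def fraction_good(z, threshold=1e-3):
    """Fraction of synthesized channels that are almost noiseless."""
    return sum(1 for e in z if e < threshold) / len(z)

eps = 0.5  # hypothetical erasure probability, capacity 1 - eps = 0.5
for levels in (4, 8, 12, 16):
    z = polarize_bec(eps, levels)
    print(levels, round(fraction_good(z), 3))
```

The printed fraction grows with the number of levels but stays 
below the capacity, which is the finite-length gap that the 
polarization speed quantifies. 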

Fig. 1. Polar codes efficiency for the BSC for error probabilities from 1% 
to 11% with a 1% step. The method described in [19] is used to compute 
the capacities of each channel for a given noise level and the frozen bits are 
chosen in order to upper bound the FER by 0.1 according to this method. 
From the bottom to the top we used the following block sizes: $2^{16}$, 
$2^{18}$, $2^{20}$, $2^{22}$, $2^{24}$. We can see that the efficiency is 
higher than the target efficiency of 95% over almost the entire range for 
block sizes equal to $2^{24}$. 

We used the polar codes construction method described in 
[19] to compute the decoding error probabilities on symmetric 
binary memoryless channels for the BSC and the BIAWGNC. 
For a given noise level on a given channel, Density Evolution 
allows us to compute the capacities of the different bits of 
the code. Some of the bits corresponding to channels with the 
lowest capacities are simply revealed and are called the frozen 
bits of the code. As explained in [19], this selection rule for 
frozen bits also gives us an upper bound on the decoding error 
probability of a block (also called the Frame Error Rate or 
FER). Since losing some blocks is not critical in QKD 
(they are simply thrown away at the verification step), we 
select sets of frozen bits that give an upper bound of 0.1 
on the FER. It appears that the polarization speed is highly 
dependent on the channel for polar codes [20]. Figure 1 gives 
the polarization speed we obtained for the BSC. It shows 
that polar codes have an efficiency above 95% over almost 
the entire error probability range [0; 0.11], which is the range 
of interest in DVQKD, for block lengths starting from $2^{24}$. 
Even smaller block lengths can be used if one does not need 
to cover the entire error probability range. The situation is 
definitely worse in Figure 2 for CVQKD. We studied the 
polarization speed for the SNRs described in [5] because 
high-efficiency multi-edge LDPC codes have been designed to deal 
with such noise levels [5], [18]. The results show that only a 
90% efficiency can be achieved with polar codes for blocks 
of size $2^{27}$, whereas efficiencies of about 95% are achieved in 
[5] with LDPC codes. However, long-distance CVQKD is still 
possible using polar codes. Furthermore, there is still some 
hope to improve the polarization speed of polar codes for the 
BIAWGNC, for example by changing the recursive method 
used to combine channels, as proposed in [25]. 
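
The selection of frozen bits under a FER target can be sketched 
as follows. The sketch assumes that the block error probability 
is bounded by the sum of the error probabilities of the 
non-frozen bit-channels (a union bound), and it takes those 
per-bit probabilities as given; in the paper they come from the 
construction method of [19]. The eight probabilities below are 
hypothetical. 

```python
def select_frozen(bit_error_probs, fer_target=0.1):
    """Keep the most reliable bit-channels as information bits while the
    sum of their error probabilities stays below fer_target.

    Returns (sorted frozen indices, value of the bound on the FER).
    """
    n = len(bit_error_probs)
    order = sorted(range(n), key=lambda i: bit_error_probs[i])
    info, bound = [], 0.0
    for i in order:
        if bound + bit_error_probs[i] > fer_target:
            break
        bound += bit_error_probs[i]
        info.append(i)
    frozen = sorted(set(range(n)) - set(info))
    return frozen, bound

# Hypothetical error probabilities for the 8 bit-channels of a toy code.
probs = [0.4, 0.2, 0.05, 0.03, 0.01, 1e-3, 1e-4, 1e-5]
frozen, bound = select_frozen(probs, fer_target=0.1)
rate = 1 - len(frozen) / len(probs)
print(frozen, round(bound, 5), rate)
```

Raising the FER target frees more information bits and therefore 
increases the rate, which is why a target as high as 0.1 is 
attractive in the QKD setting. 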

IV. Decoding speed: numerical results 

Fig. 2. Polar codes efficiency for the AWGNC for the SNRs 1.097, 0.161, 
0.075, 0.029 from [5]. The method described in [19] is used to compute the 
capacities of each channel for a given noise level and the frozen bits are 
chosen in order to bound the FER by 0.1 according to this method. From the 
bottom to the top we used the following block sizes: $2^{17}$, $2^{19}$, 
$2^{21}$, $2^{23}$, $2^{25}$, $2^{27}$. We can see that the efficiency is 
higher than the target efficiency of 90% over almost the entire range for 
block sizes equal to $2^{27}$. 

TABLE I 
Polar codes decoding speeds on the BSC and the BIAWGNC. 
The efficiencies correspond to a block error rate of 0.1 when 
selecting the frozen bits according to the method described 
in [19]. These figures were obtained with one core of an Intel 
Core i5 670 3.47 GHz processor. 

Channel    QBER / SNR   Size       $\beta$   Speed (Mb/s)   FER 
BSC        2.0%         2 lti      93.5%     10.9           0.09 
BSC        2.0%         $2^{20}$   96.3%     9.5            0.11 
BSC        2.0%         $2^{24}$   98.0%     8.3            0.08 
BIAWGNC    1.097        $2^{24}$   95.2%     8.0            0.10 
BIAWGNC    0.161        $2^{27}$   92.8%     7.3            0.09 

TABLE II 
LDPC codes decoding speeds with LDPC codes described in [11] 
for the BSC and in [18] for the BIAWGNC. The maximum 
number of iterations was fixed to 20 for the first code, and 
respectively to 160 and 100 for the next two codes. These 
figures were obtained with an AMD Tahiti graphics processor. 

Channel    QBER / SNR   Size       $\beta$   Speed (Mb/s)   FER 
BSC        2.0%         2 iv       92.9%     7.3            0.01 
BIAWGNC    1.097        $2^{20}$   96.9%     6.5            0.09 
BIAWGNC    0.161        $2^{20}$   93.1%     7.1            0.04 

TABLE III 
LDPC codes decoding speeds with LDPC codes described in [11] 
for the BSC and in [18] for the BIAWGNC. The maximum 
number of iterations was fixed to 15 for the first code, and 
respectively to 100 and 50 for the next two codes. These 
figures were obtained with one core of an Intel Core i5 670 
3.47 GHz processor. 

Channel    QBER / SNR   Size       $\beta$   Speed (Mb/s)   FER 
BSC        2.0%         2 iv       93.1%     0.82           0.03 
BIAWGNC    1.097        $2^{20}$   96.9%     0.09           0.03 
BIAWGNC    0.161        $2^{20}$   93.1%     0.12           0.04 

An interesting feature of polar codes is their regular 
recursive structure. This allows us to implement a recursive, 
successive-cancellation decoder that achieves a speed of about 
10 Mb/s on modern CPUs (Intel Core i5 670 3.47 GHz in 
the simulations). The main optimization in this decoder is to 
use fixed-point arithmetic and a table-lookup implementation 
of the function $\varphi(x) = \log(\tanh(x/2))$ used to update 
log-likelihood ratios (LLRs). Other techniques have been proposed 
for efficient polar codes decoding and could improve the 
decoding speeds given in Table I: in [26], the authors propose 
look-ahead techniques that reduce the decoding 
latency of successive cancellation by 50%, while in [27], 
[28], [29], some variants of list decoding for polar codes are 
introduced. 
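
The recursive structure of successive cancellation can be 
sketched in a few lines. This is a floating-point illustration 
under stated assumptions, not the fixed-point, table-lookup 
decoder benchmarked here: the check-node update is computed 
exactly in a numerically stable form, bits are in natural order, 
the frozen bits are passed as a dictionary of revealed values, 
and the 8-bit block with frozen positions 0, 1, 2 and 4 is a toy 
choice. All function names are hypothetical. 

```python
import math

def f_exact(a, b):
    """LLR of the XOR of two bits whose LLRs are a and b (stable exact form)."""
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return (sign * min(abs(a), abs(b))
            + math.log1p(math.exp(-abs(a + b)))
            - math.log1p(math.exp(-abs(a - b))))

def polar_transform(u):
    """Polar transform over GF(2), natural bit order; it is its own inverse."""
    n = len(u)
    if n == 1:
        return list(u)
    left = polar_transform(u[: n // 2])
    right = polar_transform(u[n // 2:])
    return [l ^ r for l, r in zip(left, right)] + right

def sc_decode(llr, frozen, offset=0):
    """Recursive successive cancellation.

    llr: channel LLRs log(P(bit=0)/P(bit=1)); frozen: {index: revealed bit}.
    Returns (decoded u, corresponding corrected block x).
    """
    n = len(llr)
    if n == 1:
        bit = frozen[offset] if offset in frozen else (0 if llr[0] >= 0 else 1)
        return [bit], [bit]
    half = n // 2
    la, lb = llr[:half], llr[half:]
    # First half of u: seen through the XOR of the two half-blocks.
    u1, x1 = sc_decode([f_exact(a, b) for a, b in zip(la, lb)], frozen, offset)
    # Second half of u: both observations combined, using the decisions on x1.
    u2, x2 = sc_decode([b + (1 - 2 * c) * a for a, b, c in zip(la, lb, x1)],
                       frozen, offset + half)
    return u1 + u2, [p ^ q for p, q in zip(x1, x2)] + x2

p = 0.02                                  # QBER of the BSC
x = [1, 0, 1, 1, 0, 0, 1, 0]              # Alice's raw key block
u = polar_transform(x)
frozen = {i: u[i] for i in (0, 1, 2, 4)}  # values revealed by Alice
y = list(x)
y[5] ^= 1                                 # one transmission error
mag = math.log((1 - p) / p)
llr = [mag if bit == 0 else -mag for bit in y]
u_hat, x_hat = sc_decode(llr, frozen)
print(x_hat == x)
```

Each level of the recursion does a linear amount of work on two 
half-size problems, which gives the N log N decoding cost; the 
strictly sequential order of the two recursive calls is also 
what prevents parallelism inside a single block. 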

The polar decoding performance has to be compared with 
the speed of an LDPC decoder based on Belief Propagation (BP). 
The speed of such a decoder drops sharply when operating close 
to capacity because BP requires more iterations to converge. 
Thus LDPC decoding speed is limited to about 800 kb/s using 
one core of a modern CPU. The LDPC CPU decoder uses 
fixed-point arithmetic and the same table-lookup implementation 
of log(tanh(x/2)) as in the polar code case. It is a shuffle 
decoder with an early termination strategy where bits are 
considered to be known (and their LLR ceases to be updated) when 
the absolute value of their LLR passes a threshold; when no bit 
is updated for a sufficient number of iterations, decoding is 
considered to be over and is stopped. Because the regime 
explored is close to the Shannon limit, simplified BP algorithms 
such as min-sum or its variants cannot be used. Finally, the 
maximum number of iterations is controlled to adjust the FER to 
the target value 0.1. This control is imprecise, however, since 
small variations of the maximum allowed number of iterations 
result in large FER changes. The maximum numbers of iterations 
used for LDPC codes are given in the captions of Table II and 
Table III. 

GPUs provide a huge amount of parallelism that allows 
us to achieve speeds of 10 Mb/s (figures are given for an 
AMD Tahiti graphics processor). The GPU LDPC decoder 
is different from the CPU implementation: it is a 
floating-point, flood decoder running in a fixed number of 
iterations and using both 'external' parallelism (several vectors 
are decoded concurrently) and 'internal' parallelism (for a single 
BP execution corresponding to one message being decoded, several 
messages are propagated concurrently). This was experimentally 
found to be optimal on GPU architectures because they 
have much more floating-point computational power than 
CPUs, but are slowed down by complex control logic. No 
competitive GPU decoder for polar codes was implemented, as 
successive cancellation is inherently sequential, and therefore 
only external parallelism can be used. 

Table I gives the decoding speeds obtained with polar codes 
for the BSC and the BIAWGNC for characteristic noise levels 
in DVQKD and CVQKD. Table II and Table III give the 
corresponding speeds with LDPC codes, with a GPU and a CPU 
respectively. 

The best reported QKD key rate is about 1 Mb/s [12], 
[30] (which is several orders of magnitude below 
state-of-the-art optical communication links that range from 
1 Gb/s to 100 Gb/s). This means that even using large blocks as 
in Table I, polar codes decoding throughput is enough for 
state-of-the-art QKD implementations. 

V. Conclusion 

We showed that polar codes can be used to perform the error 
correction step for both DVQKD and CVQKD. They achieve 
good efficiencies for the BSC and the BIAWGNC at noise 
levels compatible with QKD. However, since the polarization 
speed of polar codes is worse for the BIAWGNC than for the 
BSC, they require larger block sizes and are less practical for 
CVQKD than for DVQKD. 

As regards the decoding step, which is often a bottleneck 
in recent QKD implementations, we showed that polar codes 
feature high-speed recursive decoding and achieve CPU de- 
coding speeds similar to LDPC GPU decoding speeds. This is 
to our knowledge the first practical application of polar codes. 